
The very first thing you should know is that I will be using the POST method of handling data (defined under FORM in the HTML code). The other method is GET. I don't use it because the amount of data you can retrieve from a GET is considerably smaller than what you can get from a POST. POST passes data to the CGI program through a stream (in the program, stdin). I must assume you understand enough C to follow this, as I do not have the time to write a C tutorial to explain it. The data from the GET method can be accessed through an environment variable called QUERY_STRING. An environment variable is a named value that lives outside the program itself: the operating system (usually through the shell, or here the web server) hands it to each program it starts. A good example of an environment variable is the familiar DOS PATH (which was copied from UNIX).
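For comparison, here is a minimal sketch of how the GET data could be picked up from QUERY_STRING. The function name get_query_string is my own invention for illustration; only the variable name QUERY_STRING comes from the CGI interface.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the raw, still URL encoded GET data.
   getenv() returns NULL when the variable is not set, so fall back
   to an empty string to keep the callers simple. */
const char *get_query_string(void)
{
    const char *qs = getenv("QUERY_STRING");
    return (qs != NULL) ? qs : "";
}
```

The string it returns has the same name=value&name=value layout as the POST data described below, so the same decoding applies to it.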
The GCC (GNU C) compiler can be found on a wide range of UNIX systems, and I suggest you use it. It is an ANSI compliant compiler distributed as free software under the GNU General Public License, and it comes packaged with many free Unix clones (such as Linux). One of the questions I most frequently get asked (and one I asked myself when I first began) is "Can I compile this on my local machine and upload it to the server?" The answer to this is no. Unless your development machine runs the same version of Unix or Linux as your ISP's server, it will not work. Operating systems are very different from each other, from the way their code compiles to the way it executes. There are many flavors of UNIX, and while a few are compatible, most you may come across are not. Wintel (Intel machines running Windows) is very different from UNIX, and no executable you compile on it will run on a UNIX platform. This can only be achieved using some kind of cross-platform compiler (I have heard of these, but their price is generally forbidding enough to forget about taking that path). It all boils down to one simple fact: a CGI compiled on the platform it is to be used on will always work better than one you try to compile elsewhere.
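#include <stdlib.h>
int main(void)
{
    double contentlength;
    const char *len1 = getenv("CONTENT_LENGTH");
    /* getenv returns NULL if the variable is not set */
    if (len1 == NULL)
        return 1;
    /* strtol takes three arguments; the 10 means decimal */
    contentlength = strtol(len1, NULL, 10);
    return 0;
}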
That code excerpt gets the number of bytes to read from the server through the
environment variable "CONTENT_LENGTH" and places it in a variable called
contentlength. I declared it as a double; strtol itself returns a long, so a
long would do just as well, but don't use anything smaller, because you might
be reading a LOT of data and the number of bytes can get pretty big. Now you
have to get the data from the server into one LONG string. The following code
combines the previous code with the code to read from the server. Just start
reading from stdin until the length specified in "CONTENT_LENGTH" is reached,
and make sure you never read more bytes than your buffer can hold. There is no
need to tell stdin to point to where the data is; just start reading from it.
The code goes as follows:
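#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    char *endptr;
    double contentlength;
    char buff[10000];
    size_t got;
    const char *len1 = getenv("CONTENT_LENGTH");
    /* No CONTENT_LENGTH means there is no POST data to read */
    if (len1 == NULL)
        return 1;
    contentlength = strtol(len1, &endptr, 10);
    /* Refuse anything that would not fit in buff (plus its '\0') */
    if (contentlength < 0 || contentlength > sizeof(buff) - 1)
        return 1;
    got = fread(buff, 1, (size_t)contentlength, stdin);
    buff[got] = '\0';
    return 0;
}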
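Since CONTENT_LENGTH comes from outside your program, it is worth checking before you trust it. The following is only a sketch of one way to do that; the name parse_content_length and its max parameter are my own assumptions, not part of CGI.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: turn the CONTENT_LENGTH text into a byte count.
   Returns -1 if the text is missing, is not a plain non-negative
   number, or is larger than max (the most the caller's buffer holds). */
long parse_content_length(const char *text, long max)
{
    char *endptr;
    long n;

    if (text == NULL || *text == '\0')
        return -1;
    errno = 0;
    n = strtol(text, &endptr, 10);
    /* *endptr must be the end of the string, or junk followed the digits */
    if (errno != 0 || *endptr != '\0' || n < 0 || n > max)
        return -1;
    return n;
}
```

With a 10000 byte buffer you could call it as parse_content_length(getenv("CONTENT_LENGTH"), sizeof(buff) - 1) and refuse the request when it returns -1.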
You now have the URL encoded data from the whole form in one variable, which I
named buff[] (as in a data buffer). Once it has stopped reading from the
server, your CGI has all of the information the user just submitted to your
form. For a form with fields named Var1, name, and address, it looks something
like this:
Var1=Two&name=Odysseus&address=Calypso%27s+island
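I used this code to write the CGI you used for the form above:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char *endptr;
    double contentlength = 0;
    char buff[10000];
    size_t got;
    const char *len1 = getenv("CONTENT_LENGTH");
    /* getenv returns NULL if there is no POST data at all */
    if (len1 != NULL)
        contentlength = strtol(len1, &endptr, 10);
    /* Never read more than buff can hold (leave room for the '\0') */
    if (contentlength < 0)
        contentlength = 0;
    if (contentlength > sizeof(buff) - 1)
        contentlength = sizeof(buff) - 1;
    got = fread(buff, 1, (size_t)contentlength, stdin);
    /* printf's %s needs a '\0' terminated string */
    buff[got] = '\0';
    printf("Content-type: text/html\n\n%s", buff);
    return 0;
}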
Ok, now you need a brief rundown of how URL encoding works. Each of the
variables provided in the form (named Var1, name, and address) is set equal to
the data it returns, and the variables are separated from each other by an &
(ampersand). All spaces, as I have pointed out before, have been removed and
replaced with +'s (plus symbols). Any other character that could be confused
with this structure is converted to a % followed by its two-digit hex code, as
is the ' (apostrophe) in "Calypso's island", which arrives as %27. Here is a
small chunk of code that decodes the hex and the spaces and puts the result
into a variable called buff2[]. The string holding the original URL encoded
data is called buff[]. Please take note of the comment "/* Prevent user from
altering URL delimiter sequence */". It means that if a user types an &
(ampersand) or an = (equals sign) into your form, it stays encoded (as %26 or
%3D) and won't mess up the natural delimiting structure of the URL encoding
scheme. The downside is that you'll have to pass each separate variable string
through another loop similar to this one to change the %26's and %3D's back
into their respective characters.
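/* Assumes char buff2[10000], hexstr[3]; int x, y; and a '\0' terminated buff[] */
for (x = 0, y = 0; buff[x] != '\0'; x++, y++)
{
    switch (buff[x])
    {
        /* Convert all + chars to space chars */
        case '+':
            buff2[y] = ' ';
            break;
        /* Convert all %xy hex codes into ASCII chars */
        case '%':
            /* Copy the two bytes following the % and end the string */
            strncpy(hexstr, &buff[x + 1], 2);
            hexstr[2] = '\0';
            /* A % too close to the end has no hex code; copy it as-is */
            if (strlen(hexstr) < 2)
            {
                buff2[y] = '%';
                break;
            }
            /* Skip over the hex */
            x = x + 2;
            /* Prevent user from altering URL delimiter sequence */
            if (strcmp(hexstr, "26") == 0 || strcmp(hexstr, "3D") == 0 || strcmp(hexstr, "3d") == 0)
            {
                /* Keep the code encoded: copy the % and both hex digits */
                buff2[y] = '%';
                strcpy(&buff2[y + 1], hexstr);
                /* The loop's own y++ then steps past the last digit */
                y = y + 2;
                break;
            }
            /* Convert the hex to ASCII */
            buff2[y] = (char)strtol(hexstr, NULL, 16);
            break;
        /* Make an exact copy of anything else */
        default:
            buff2[y] = buff[x];
            break;
    }
}
/* End the decoded string */
buff2[y] = '\0';
That's about all there is to URL encoding. Also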
note that if you put a CheckBox on your form and the user does not check it,
neither its name nor a tag indicating that it is unchecked is included in the
URL encoded string. It will appear as if it doesn't exist at all. You must code
around this quirk (or stupidity, in my own terms) and find your own solution to
the problem.
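As a rough sketch of that second pass and of one way to code around the missing checkbox, the two functions below pull a single field out of the decoded string. The names restore_delimiters and get_field are made up for this example, and it assumes the input is the output of the decoding loop above (so & and = only appear as real delimiters).

```c
#include <string.h>

/* Second pass: turn the protected %26 and %3D sequences back into
   '&' and '=' inside one already extracted value (done in place). */
void restore_delimiters(char *s)
{
    char *out = s;

    while (*s != '\0') {
        if (s[0] == '%' && s[1] == '2' && s[2] == '6') {
            *out++ = '&';
            s += 3;
        } else if (s[0] == '%' && s[1] == '3' && (s[2] == 'D' || s[2] == 'd')) {
            *out++ = '=';
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Look up one field in the first pass output (name=value&name=value).
   Copies the value into out and returns 1, or returns 0 if the name
   does not appear at all, which is how an unchecked CheckBox looks. */
int get_field(const char *data, const char *name, char *out, size_t outsize)
{
    size_t namelen = strlen(name);
    const char *p = data;

    if (outsize == 0)
        return 0;
    while (*p != '\0') {
        const char *end = strchr(p, '&');
        size_t pairlen = (end != NULL) ? (size_t)(end - p) : strlen(p);

        /* The pair must be the name, then '=', then the value */
        if (pairlen > namelen && strncmp(p, name, namelen) == 0
            && p[namelen] == '=') {
            size_t vallen = pairlen - namelen - 1;
            if (vallen >= outsize)
                vallen = outsize - 1;   /* truncate rather than overflow */
            memcpy(out, p + namelen + 1, vallen);
            out[vallen] = '\0';
            restore_delimiters(out);
            return 1;
        }
        if (end == NULL)
            break;
        p = end + 1;
    }
    return 0;
}
```

For a CheckBox with a hypothetical name such as subscribe, a return value of 0 from get_field(buff2, "subscribe", value, sizeof(value)) can simply be treated as "not checked".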
To send a message back to the user, print it to stdout after a Content-type header:
printf("Content-type: text/html\n\n This is a sample message.\n");
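You must have "Content-type: text/html\n\n", INCLUDING the two \n's. They are crucial: the first ends the header line, and the second produces the blank line that tells the server the headers are finished and the data for the user follows. You can also use HTML code in the message, like so:
printf("Content-type: text/html\n\n <I> This would appear in Italics </I>.");
Instead of printing a page, you can redirect the user to another page with a Location header:
printf("Location: http://yourdns.yoursite.com/~whoeveryouare/page.html\n\n");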
Again, the two \n's on the end are crucial for the command to work, so don't forget them. Another little quirk of CGI that I should report is that you must either redirect the user to another page or print something to stdout (which the server passes on to the user's browser). The important point here is that the CGI must output SOMETHING. It doesn't matter if you print a message like "Operation complete, please return to..." or simply redirect them back to your homepage; all that is important is that you output some data to the user. Failure to follow this simple principle may cause you hundreds of "Server Error 500" messages until you realize your error. This has happened to me on more than one occasion. Bad habits are hard to break!
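One way to make sure you never forget the output is to end every path through the program with the same call. This is only a sketch: the function names and the error text are my own, and in a real CGI you would pass stdout as the first argument.

```c
#include <stdio.h>

/* Hypothetical helper: send a short HTML page. The blank line after
   the Content-type header (the second \n) ends the header section. */
void send_message(FILE *out, const char *html)
{
    fprintf(out, "Content-type: text/html\n\n%s\n", html);
}

/* Hypothetical helper: send the browser to another page instead. */
void send_redirect(FILE *out, const char *url)
{
    fprintf(out, "Location: %s\n\n", url);
}

/* Whatever happened earlier in the program, finish with exactly one
   of the two responses so the server always has something to return. */
void finish_request(FILE *out, int ok, const char *done_url)
{
    if (ok)
        send_redirect(out, done_url);
    else
        send_message(out, "<I>Something went wrong, please go back and try again.</I>");
}
```

Calling finish_request(stdout, ok, "http://yourdns.yoursite.com/~whoeveryouare/page.html") as the last statement of main() means the server gets a response on both the success and the failure path.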
To compile your program on the server, use:
gcc -o youroutputcgi.cgi yourscript.c
That will take the source code you wrote (the ".c" file; for a ".cpp" file use g++ instead of gcc) and compile it into a CGI program called "youroutputcgi.cgi". Replace the file name with whatever name pleases you, just be sure it is a ".cgi" file. If it still doesn't seem to work, contact your ISP, because the system may be configured so that only certain extensions (like .CGI or .EXE) can be executed. Also, if you have an ANSI C/C++ compiler locally (on your own computer), you may want to try compiling there before compiling on the UNIX server. You can use this to catch those stupid little errors in your program that make you want to hit your computer (or yourself) repeatedly with your fluffy stress bat. But you must still remember that the executable you make on your machine WILL NOT run on a UNIX machine; you have to compile it on the UNIX server.
You MUST already have any output files created, or have file creation privileges set for the directory so that the program can create its own. Whichever you choose, one Unix command will do the trick. To open up full read, write, and execute access to a file or directory, use:
chmod ugo=rwx yourfile.html
or
chmod ugo=rwx yourdirectory
The first gives your output file (if you want to create an HTML document on the fly) full access, and the second gives a directory full access so you can create output files from inside the program. The ugo=rwx breakdown is as follows:
u=user - yourself, how much access do you want to your own file :)
g=group - all of the members of your group
o=other - internet surfers, those you intend the data to reach
r=read - read access to a file allows a user to read what's in the file
w=write - write access is for files that must be written to (like an output file)
x=execute - you must set the execute status on your script to run it, like the following:
chmod ugo=x yourcgi.cgi
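Inside the program, a permissions problem shows up as a failed attempt to open the file, so it pays to check for it. A minimal sketch, assuming a helper I have called append_entry (the name and the one-line-per-entry format are my own choices):

```c
#include <stdio.h>

/* Hypothetical helper: append one line to an output file.
   Returns 1 on success, or 0 if the file could not be opened or
   written, which is what happens when the permissions are wrong. */
int append_entry(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");   /* "a" creates the file if allowed */
    int ok;

    if (fp == NULL)
        return 0;
    ok = (fprintf(fp, "%s\n", line) >= 0);
    if (fclose(fp) != 0)
        ok = 0;
    return ok;
}
```

If append_entry("yourfile.html", "some text") returns 0, the write access on the file (or the directory, if the file does not exist yet) is the first thing to look at.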
Give whatever access to the file or directory you deem necessary, but no more: you can create a security breach if you open up all of your files to write access. Someone could overwrite YOUR code! Also, you do not have to set the user, group, and other attributes at the same time; you can set just one or two, like so:
chmod o=x yourcgi.cgi
chmod uo=rw guestbook.html
The first one makes the CGI you wrote available to users on the internet as an executable program, and the second configures a file named "guestbook.html" as a read and write access file for yourself and internet users. The execute attribute is also what tells the server that the file is a program instead of a text document, so you MUST at least set it on the CGI itself. Well, that's about all you need to know to get into the exciting world of CGI, and you can use a language you already know, C. The rest is up to you, as it is now in your hands to develop what YOU want. Now start coding!